
This README describes the backup for the article "Knowledge access: The effects of Carnegie libraries on innovation" by Enrico Berkes and Peter Nencka.


#########

User commands

To run the scripts in this folder, you will need the following user-written commands for Stata:

blindschemes 
estout 
esttab (installed with the estout package)
ftools
gtools 
parmest 
regsave 
reghdfe 
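
As a convenience, the list above can be turned into a Stata install script. The sketch below is only an illustration: it assumes every command is distributed through SSC (this README does not say where each was obtained), and the output file name "00_install_packages.do" is hypothetical and not part of this folder.

```python
from pathlib import Path

# Package names as they would be requested from SSC (an assumption).
# esttab is omitted because it ships inside the estout package.
SSC_PACKAGES = [
    "blindschemes", "estout", "ftools", "gtools",
    "parmest", "regsave", "reghdfe",
]


def build_install_script(packages):
    """Return the text of a Stata do-file with one `ssc install` line per package."""
    lines = [f"ssc install {name}, replace" for name in packages]
    return "\n".join(lines) + "\n"


def write_install_script(path="00_install_packages.do"):
    """Write the do-file to `path` (a hypothetical file name) and return the path."""
    out = Path(path)
    out.write_text(build_install_script(SSC_PACKAGES))
    return out


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them in Stata.
    print(build_install_script(SSC_PACKAGES), end="")
```

The printed lines can be pasted into Stata, or saved with write_install_script() and run as a do-file.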


Other analysis is completed in R and Python. For those languages, the needed user-provided packages are listed at the top of each script.

#########

Note on restricted data:

We use restricted-access census data throughout this paper. This data is needed to construct our final analysis sample. You can find information on accessing this data here:
https://www.nber.org/research/ancestrycom-and-ipums-complete-count-restricted-file

We include all scripts needed to process the restricted data if you have access to the NBER server. We also include the scripts that, given the restricted input, will construct the analysis sample and conduct analysis.


############
Organization

There are two folders in this directory. 

The analysis folder is called "~2023.10_conditional_accept_upload"
The data folder is called "~data_restat_upload"

Each folder inside "~data_restat_upload" corresponds to the dataset it is named after.

Files in the "~2023.10_conditional_accept_upload" folder create the analysis tables and figures used in the paper. We describe them in the "analysis" section below.

To run the scripts, you will need to edit the paths to match the location and names of these high-level folders on your computer. The scripts use the folder names "2023.10_conditional_accept" and "data_restat".
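
To find the lines that need editing, one option is to search the scripts for the two folder names. The sketch below is a hypothetical helper, not part of this folder. It assumes the scripts carry the extensions .do, .R, or .py, and it only reports matching lines without changing any file.

```python
from pathlib import Path

# Folder names the scripts refer to, as stated in the paragraph above.
FOLDER_NAMES = ("2023.10_conditional_accept", "data_restat")
# Extensions to scan: Stata, R, and Python scripts (an assumption).
SCRIPT_SUFFIXES = {".do", ".r", ".py"}


def find_path_references(root):
    """Return (file, line number, line text) for each script line naming a folder."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in SCRIPT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(name in line for name in FOLDER_NAMES):
                hits.append((str(path), number, line.strip()))
    return hits


# Example use, run from the directory that holds both high-level folders:
#     for file, number, line in find_path_references("."):
#         print(f"{file}:{number}: {line}")
```

Each reported line contains a path that should be checked against the folder locations on your computer.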

#########
Data folders:

\~Combined
	*Combines data from subsequently listed folders into an analysis dataset that is used in the analysis folder below.

\ALA
	*Contains information on the types of books contained in a 1904 American Library Association catalog 

\Carnegie libraries
	*Creates dataset of libraries and descriptive library figures
        
\Census
	*Creates county-level census dataset, from IPUMS extract

\City distances
	*Computes distances between library cities and cities in the patent data for the distance exercises

    *Note: Takes data from "Carnegie libraries" and "patent_data" as inputs

    *Note: To re-run, you will need to provide your own Google Maps API key in script "02 Geocode library cities.R". As of 2023, this geocoding step should be within Google's free monthly usage limits.

\City populations
	*Contains city population data

\City_Covariates
  	*Contains restricted data processing using the historical census data

    *See "Note on restricted data" above for additional details

\college_data
  	*Contains information on college locations

    *Note: To re-run, you will need to provide your own Google Maps API key in script "02 Geocode cities.R". As of 2023, this geocoding step should be within Google's free monthly usage limits.

\elections
	*Contains information on historical elections

\library_books
	*Contains library book holdings for five libraries from the Main Street Public Library Dataset

\maps
	*Contains map shapefiles

\Newspapers
	*Contains information on historical daily newspapers, from the United States Newspaper Panel

\patent_data
	*Contains information on patents, from the Comprehensive Universe of U.S. Patents

\Unions
	*Contains information on unions and the Knights of Labor, provided by Luca Bittarello



#########
Analysis folder:

The analysis folder creates the tables and figures in the paper. Below, we list each figure and table in the draft, followed by the script that produces it and the corresponding output files.

When not explicitly indicated, the output files for multi-panel figures are listed in panel order (e.g., A to F).

**Main draft figures and tables**

    *Figure 1: Comparison of city characteristics between Carnegie and rejecting cities and Carnegie and non-applicant cities
        Panel A: 01_carnegie_rejected_results.do, 03_summary_stat_carn.eps
        Panel B: 06_all_cities.do, 03_summary_stat_apply.eps


    *Figure 2: Average patenting activity in treatment and control groups relative to grant years
        Panel A: 01_carnegie_rejected_results.do, 03carnegie_ihs_gy.eps
        Panel B: 01_carnegie_rejected_results.do, 03carnegie_count_gy.eps

    *Figure 3: Event study estimates of Carnegie libraries on patenting for alternative patent measures
        Panel A-F: 01_carnegie_rejected_results.do
                ihs_gryr.png, ln_gryr.png, patents_gryr.png, ppml_gryr.png, ihs0_gryr.png, ln0_gryr.png

    *Figure 4: Event study estimates of Carnegie libraries on patenting for alternative patent measures (yearly)
        Panel A-F: 01_carnegie_rejected_results.do
                yearly_ihs_gryr.png, yearly_ln_gryr.png, yearly_patents_gryr.png, yearly_ppml_gryr.png, yearly_ihs0_gryr.png, yearly_ln0_gryr.png

    *Table 1: Effect of Carnegie libraries on patenting
        01_carnegie_rejected_results.do
            ihs_patents.tex, ihs_patents_1930.tex

    *Table 2: Difference-in-differences estimates by patent classes
        01_carnegie_rejected_results.do
            class_hetsy.tex, class_hetgy.tex

    *Table 3: Difference-in-differences estimates by patent classes
        01_carnegie_rejected_results.do
            cited_booksy.tex, cited_bookgyy.tex, multi_inventorsy.tex, multi_inventorgyy.tex


**Online Appendix figures and tables**

    *Figure A1: Cumulative distribution for cities that built a Carnegie library and cities that rejected a Carnegie grant by grant-year
         ~data_restat_upload/Carnegie libraries/Scripts/02_Create_graphs.do
            cumulative_accepted_libraries.eps, cumulative_rejected_libraries.eps
    
    *Figure A2: Map of all built and rejected Carnegie libraries
        09_make_city_maps_figures_a2_a3
            libraries_treated_v_control.png

    *Figure A3: Map of built Carnegie libraries over time
        09_make_city_maps_figures_a2_a3
            libraries_over_time.png

    *Figure A4: Distribution of time required to construct libraries after library grants
        ~data_restat_upload/Carnegie libraries/Scripts/02_Create_graphs.do
            years_build_library.eps

    *Figure A5: Spillover effects of Carnegie libraries on patenting in nearby areas
        03_distance_gradient.do
            spillover_sygy_bins.eps

    *Figure A6: Spillover effects of Carnegie libraries on patenting in nearby areas, non-intersecting distance bins
        03_distance_gradient.do
            spillover_sygy_bins.eps
    
     *Figure A7: Distribution of technical books in the 1904 ALA catalog
        10_dewey_distribution_in_ala_figure_a7
            ala_books_dewey_bar.png

     *Figure A8: Distribution of technical books in the historical book catalog of 5 libraries
        11_dewey_distribution_in_library_catalogs_figure_a8
            LexingtonMI_bar.png, MorrisIL_bar.png, OsageIA_bar.png, RhinelanderWI_bar.png, SaukCenterMN_bar.png, all_catalogs_bar.png

     *Figure A9: Event study estimates of Carnegie libraries on patenting for alternative patent measures, models with city and state-year fixed effects
        Panel A-F: 01_carnegie_rejected_results.do
                ihs_sy.png, ln_sy.png, patents_sy.png, ppml_sy.png, ihs0_sy.png, ln0_sy.png

     *Figure A10: Event study estimates of Carnegie libraries on patenting for alternative patent measures, models with city and state-year fixed effects (yearly)
        Panel A-F: 01_carnegie_rejected_results.do
                yearly_ihs_sy.png, yearly_ln_sy.png, yearly_patents_sy.png, yearly_ppml_sy.png, yearly_ihs0_sy.png, yearly_ln0_sy.png

     *Figure A12: Event study estimates of Carnegie libraries on patenting, Gardner method
        Panel A-B: 01_carnegie_rejected_results.do
                gardner_sy.png, gardner_sy0.png

     *Figure A13: Event study estimates of Carnegie libraries on patenting for alternative patent measures, stacking approach
        Panel A-D: 07_stacked
                stacked_sygy.png, stacked_sygy0.png, stacked_count_sygy, stacked_ppml_sygy

     *Figure A14: Event study estimates of Carnegie libraries on patenting for alternative patent measures, stacking approach using +1/-1 year window
        Panel A-D: 07_stacked
                stacked_close_sygy.png, stacked_close_sygy0.png, stacked_close_count_sygy, stacked_close_ppml_sygy

     *Figure A15: Share of solo-authored U.S. patents by filing year
        05_pat_graphs
            single_author.eps

     *Figure A16: Share of U.S. patents by Cooperative Patent Classification technology class by decade
        05_pat_graphs
            by_cpc.eps
    
     *Figure A17: Null treatment effect estimate distribution and actual treatment effect estimates (dashed line)
        08_all_cities_power
            placebo_sy.png, placebo_sygy.png

    *Table A1: Summary statistics
        01_carnegie_rejected_results.do
            summary_stat1.tex, summary_stat2.tex

    *Table A2: Effect of Carnegie library entry on local newspaper market and vote shares
        02_carnegie_rejected_results_news.do
            ~2023.10_conditional_accept_upload/Log/newspapers

    *Table A3: Robustness of difference-in-differences results to alternative specifications
        01_carnegie_rejected_results.do
            ~2023.10_conditional_accept_upload/Log/main
   
    *Table A4: Effect of Carnegie libraries on measures of patent quality
        01_carnegie_rejected_results.do
            any_forward_citation.tex, forward_citations.tex, had_break_p90_rrfsim010.tex

    *Table A5: Effect of Carnegie libraries on women and immigrant patenting
        01_carnegie_rejected_results.do
            total_female_patents.tex, share_female.tex, total_immigrant_patents.tex, share_immigrant.tex

    *Table A6: Effect of Carnegie libraries on patenting, extensive margin
        01_carnegie_rejected_results.do
            got_patent.tex, got_patent_1930.tex
    
    *Table A7: Effect of Carnegie libraries on patenting, extensive margin (first-time inventors) 
        01_carnegie_rejected_results.do
            first_patent.tex, share_first_patent.tex

    *Table A8: Heterogeneity in library difference-in-differences estimates across city characteristics
        01_carnegie_rejected_results.do
            ~2023.10_conditional_accept_upload/Log/main

    *Table A9: Robustness of difference-in-differences patent results to alternative samples and post period lengths
        01_carnegie_rejected_results.do
            sample_r_add_sy, sample_r_add_sygy, sample10_r_add_sy, sample10_r_add_sygy, robust_sy, robust_sy10, robust_sygy, robust_sygy10

    *Table A10: Robustness of difference-in-differences patent results to alternative samples and post period lengths
        01_carnegie_rejected_results.do
            ~2023.10_conditional_accept_upload/Log/main

    *Table A11: Effect of Carnegie libraries on patenting, log patents
        01_carnegie_rejected_results.do
            ln_patent_count, ln_patent_count_1930
     
     *Table A12: Effect of Carnegie libraries on patenting, patent counts
         01_carnegie_rejected_results.do
            patent_count, patent_count_1930
    
     *Table A13: Effect of Carnegie libraries on patenting (aggregated model)
        01_carnegie_rejected_results.do
            ~2023.10_conditional_accept_upload/Log/main
   
       
   



